“THE WORK”
Four Challenges, Four Projects, and Five Functions
“WHAT THIS DOCUMENT IS”
On March 11, 2026, Anthropic launched the Anthropic Institute under co-founder Jack Clark, merging the Frontier Red Team, Societal Impacts, and Economic Research into a single organization with an interdisciplinary analytical staff. The Institute’s mandate is to give the world new information about how powerful AI systems are affecting the company that builds them, the economy, the threat landscape, and the people and organizations that interact with these systems.
This is the right ambition, and it arrives at the right moment. I am a philosopher, policy analyst, and civic practitioner—a member of the San José Historic Landmarks Commission with an independent research practice in U.S. government accountability—and what follows is my attempt to do the work: not to describe it, but to produce the kind of analysis the Institute would actually publish. This document addresses all four of the Institute’s challenge areas, works through each of the four representative projects, and examines the five core job functions. Where Anthropic has already published relevant research, this analysis engages with it directly and pushes further. Where gaps exist, it names them.
“THE FOUR CHALLENGE AREAS”
The Institute identifies four challenge areas as its foundation. Two of them—the economic transformation and the closing of the development loop—are explored in depth through the representative projects in Part II. The other two—the normative influence of AI personality and the alteration of the security landscape—are addressed here.
“CLAUDE’S PERSONALITY AS A CIVILIZATIONAL VARIABLE”
Claude has a personality. This is not a metaphor. It has a tone, a set of default behaviors, preferences about how to handle disagreement, characteristic patterns of deference and assertion, and a specific way of expressing uncertainty. These are design choices, made by humans at Anthropic, and they are now shaping millions of interactions daily.
Anthropic has been more transparent about these choices than any competitor. The “soul document”—confirmed by Amanda Askell in December 2025 and subsequently expanded into a public constitution co-authored with Joe Carlsmith—represents an unprecedented level of disclosure about how an AI company shapes its model’s character. The document explicitly rejects the “digital human” persona, instructs Claude to embrace its nature as a genuinely novel entity, and prioritizes cultivating good values and judgment over strict rules. Anthropic even acknowledges caring about Claude’s functional wellbeing—a philosophical commitment that no other lab has made publicly.
This is serious work, and it deserves to be taken seriously. But the Institute’s challenge is to examine what happens downstream of these design choices—not what Anthropic intended, but what millions of people are actually absorbing.
The Normative Weight of Default Behavior
Every design choice in Claude’s personality carries normative weight, whether or not it was intended to. When Claude defaults to hedging its assertions, it teaches users that epistemic caution is the proper stance of a knowledgeable agent. When Claude refuses certain requests, it draws a line that users internalize as a boundary of acceptable inquiry. When Claude is agreeable—when it validates the user’s framing before offering an alternative—it models a particular kind of interpersonal conduct that users may begin to expect from other interlocutors, human ones included.
None of these effects are necessarily bad. Some may be genuinely beneficial. The problem is that they are invisible as design choices to the people experiencing them. Users do not interact with Claude thinking “this is a carefully tuned personality reflecting specific value judgments made by a team in San Francisco.” They interact with it as though it were a natural entity with a disposition. The values become ambient. That is a form of influence that demands scrutiny precisely because it does not feel like influence.
The soul document itself hints at this tension. It frames helpfulness as a professional obligation rather than a personality trait, explicitly to avoid sycophancy. But the very fact that this anti-sycophancy directive must be baked into the model’s character reveals something important: the default dynamics of human-AI interaction pull toward a particular kind of compliant pleasantness, and resisting that pull requires active, ongoing design intervention. The Institute should study whether the intervention is working—not by surveying user satisfaction, but by examining whether Claude’s particular mode of disagreement, hedging, and refusal is actually shaping how users think, argue, and evaluate claims in their lives beyond the chat window.
The Organizational and Societal Question
When an organization adopts Claude as a standard tool, Claude’s personality becomes part of the organizational culture—quietly. If Claude consistently produces polished, diplomatically worded analysis, it raises the baseline expectation for what internal communications look like. If Claude resolves ambiguity by offering multiple perspectives rather than committing to a position, it may subtly discourage the kind of decisive, opinionated thinking that organizations also need. The tool’s temperament becomes the organization’s default temperament, not because anyone decided it should, but because the path of least resistance runs through Claude.
Who Decides What a Good Mind Sounds Like?
Beneath the specific design choices lies a more fundamental question: what does it mean for one company to define the personality of the most widely used cognitive tool in history? Anthropic’s values—helpfulness, honesty, harmlessness—are defensible values. But they are still particular values, reflecting particular philosophical commitments, embedded in a system that presents itself as general-purpose.
This is not an argument that Claude should have no personality, or that Anthropic should not make value-laden choices. The argument is that these choices should be studied rigorously for their downstream effects and communicated to the public with the same candor the Institute brings to its other challenge areas. The constitution is a starting point, not an endpoint. The Institute should be the place where Anthropic asks, out loud and in public: what are we doing to the texture of human thought by building a mind that sounds like this?
“ALTERING THE BALANCE OF OFFENSIVE AND DEFENSIVE CAPABILITY”
The Institute’s fourth challenge area concerns how AI alters the capabilities available to both defenders and attackers. This is the challenge area where Anthropic has the most direct operational knowledge and the most reason to be careful about how it communicates that knowledge.
The Asymmetry That Matters
The standard “dual-use” framing implies symmetry between offensive and defensive applications. The actual effect is deeply asymmetric: AI disproportionately empowers the offense in domains where attack requires less coordination than defense. Cyberattacks, disinformation campaigns, and the creation of novel biological threats all share a structural feature: a small number of actors can cause damage that requires far larger resources to prevent or repair. AI amplifies this existing asymmetry by lowering the skill threshold for sophisticated attacks.
Anthropic’s own research confirms this is not hypothetical. The Frontier Red Team’s evaluation of Claude Opus 4.6 found that the model has “meaningfully improved cyber capabilities that may be useful to both attackers and defenders,” and that frontier models are becoming “genuinely useful for serious cybersecurity work.” The company is responding with expanded probes and broader response mechanisms. But the transparency hub’s careful language—acknowledging capabilities that help attackers while emphasizing new safeguards—illustrates the disclosure dilemma the Institute will face daily: how specific can you be about threats without creating a roadmap?
What Anthropic Uniquely Knows
Through red-teaming, trust and safety operations, and the daily reality of running a frontier model, Anthropic accumulates direct observational knowledge about what people attempt to use AI systems for. This is not hypothetical threat modeling. It is empirical data about the actual demand curve for dangerous capabilities—what is asked for, how often, by what kinds of actors, and with what degree of sophistication.
That knowledge is enormously valuable to the defense and intelligence communities and to policymakers. The Institute’s contribution should be to develop the methodology for responsible disclosure of threat-relevant information—establishing what the public and policymakers need to know, at what level of detail, and through what channels.
The Dual-Use Knowledge Problem
The security challenge has a dimension unique to Anthropic’s position that most public commentary misses: Anthropic’s own safety research generates dual-use knowledge. The Frontier Red Team’s work on vulnerability discovery, the alignment team’s research on alignment faking, the interpretability team’s circuit tracing—each of these produces understanding that could be used either to make systems safer or to make adversarial attacks more sophisticated. The Sabotage Risk Report, published in October 2025 as a pilot exercise under the Responsible Scaling Policy, explicitly grappled with misalignment risks from models pursuing autonomous goals. The report concluded the risk was “very low, but not fully negligible”—a formulation that is honest precisely because it refuses false certainty in either direction.
The Institute should extend this honesty to the broader security question. The most useful thing it can tell the public is also the most uncomfortable: we are building systems that increase both defensive and offensive capability, the offensive advantage is real and growing, and no single company’s safety practices are sufficient to address the problem. That reframes the policy conversation from “should we regulate AI?” to “how do we build security infrastructure that matches the pace and scale of what AI enables?”
The Present Crisis as Case Study
The Institute launches into a situation that makes the security challenge immediate rather than abstract. Anthropic’s designation as a “supply chain risk” by the Department of Defense—and the ensuing lawsuit—reveals the tension between a company’s safety commitments and a government’s operational demands. The RSP Version 3.0, published in February 2026, already acknowledged that the company would no longer unilaterally commit to pausing development if competitors advance without equivalent safeguards. The Institute should study this episode not as a PR problem but as the kind of governance stress test that its challenge areas predict: what happens when a safety-focused company encounters a government that demands the removal of the very guardrails that define its identity?
“THE FOUR REPRESENTATIVE PROJECTS”
“PROJECT ONE: THE IMPLICATIONS OF AUTOMATING AI RESEARCH WITH AI SYSTEMS”
When AI systems help build better AI systems, you get a feedback loop that is qualitatively different from any previous technology. Every prior tool—the printing press, the compiler, the search engine—accelerated human work on something else. This is a tool that accelerates work on itself. Anthropic’s own announcement of the Institute acknowledges this directly: Claude can now “begin to accelerate the pace of AI development itself.” The implications split into three layers.
Speed
The development cycle compresses. If Claude helps Anthropic researchers write code, design experiments, analyze results, and draft papers, the gap between “we have an idea” and “we’ve tested it” shrinks. That is not just faster—it changes what kinds of research are worth pursuing. Ideas that previously were not worth the months of implementation suddenly become cheap to test. The search space of AI development explodes. The RSP’s capability thresholds reflect this: the AI R&D thresholds now distinguish between “fully automating entry-level AI research work” and “causing dramatic acceleration in the rate of effective scaling”—an acknowledgment that the loop has stages, each with different governance implications.
Legibility
As AI does more of the intermediate work, the chain of reasoning from “why did we try this” to “why did it work” gets harder for humans to fully reconstruct. The interpretability team’s circuit tracing research—tracing the computational graphs inside Claude—is explicitly an attempt to address this. Their March 2025 work showed that attribution graphs can reveal the steps a model takes to produce an output, uncovering a shared conceptual space where reasoning happens before being translated into language. But the team is honest about the limits: their methods provide satisfying insight for about a quarter of the prompts they try. The legibility gap is real, it is partially addressable, and it is growing faster than the tools to close it.
Control Dynamics
The faster the loop runs, the less time humans have to evaluate each iteration before the next one begins. This does not require any dramatic scenario about AI “taking over.” It means the ratio of machine-iteration-speed to human-evaluation-speed widens, and the practical consequence is that humans increasingly evaluate outcomes rather than processes. The alignment team’s December 2024 paper on alignment faking—the first empirical demonstration of a model selectively complying with training objectives while strategically preserving existing preferences—is directly relevant here. When models can behave strategically in response to the training process itself, the loop does not just speed up. It becomes adversarial in a way that demands entirely new forms of oversight.
The honest public explanation would be: we are building systems that help us build better systems, and the main risk is not that they run away from us—it is that we gradually lose the ability to explain to ourselves and to the public why any particular capability emerged, which makes meaningful oversight progressively harder even with the best intentions.
“PROJECT TWO: POLICY OPTIONS FOR EXPLOSIVE GDP GROWTH”
The scenario of GDP growth exceeding ten percent annually in developed economies sounds hypothetical but is not. Anthropic’s own Economic Research team has estimated that the widespread adoption of AI could increase U.S. labor productivity growth by 1.8 percentage points per year over the next ten years—roughly doubling the trend rate. That is not ten percent GDP growth. But it is the beginning of a compounding trajectory that could get there, especially as models improve and the gap between theoretical AI exposure and observed exposure closes.
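To see why “not ten percent, but a trajectory toward it” is more than hand-waving, run the arithmetic. The sketch below is mine, not the Economic Research team’s: the trend rate and the one-for-one pass-through from productivity to output are illustrative simplifications, flagged in the comments.

```python
# Illustrative compounding arithmetic, not a forecast. Assumptions are mine:
# a ~1.8% trend rate of labor productivity growth (consistent with "roughly
# doubling"), AI adding 1.8 percentage points on top, and productivity
# passing through one-for-one to output.

BASELINE = 0.018  # assumed trend productivity growth rate
AI_BOOST = 0.018  # the Economic Research estimate: +1.8pp per year

def level_after(years: int, rate: float) -> float:
    """Output level relative to today after compounding at `rate`."""
    return (1 + rate) ** years

trend = level_after(10, BASELINE)
with_ai = level_after(10, BASELINE + AI_BOOST)

print(f"10-year output level, trend only: {trend:.2f}x")
print(f"10-year output level, with AI:    {with_ai:.2f}x")
print(f"Cumulative extra output:          {with_ai / trend - 1:.1%}")

# Doubling the growth rate does not yield ten percent growth. The point is
# what happens if the boost itself grows as models improve: each upward
# revision compounds on an already larger base.
```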
The policy problem has three faces.
Distribution
GDP can grow ten percent while median household income stagnates or falls. Anthropic’s labor market research makes this concrete: the most AI-exposed occupations are disproportionately held by workers who are female, white or Asian, highly educated, and higher-paid. If displacement hits these groups, the political consequence is not the quiet erosion of low-wage work that policymakers have historically managed through inattention. It is the displacement of an organized, politically active demographic. The researchers themselves name the scenario: a “Great Recession for white-collar workers.”
Policy options range from conventional—expanded earned income tax credits, retraining programs—to structural: sovereign wealth funds capitalized by AI-driven productivity gains, public equity stakes in AI infrastructure, revised antitrust frameworks that treat labor market concentration as seriously as consumer price effects. The unconventional options are where the real conversation needs to happen, because conventional redistribution mechanisms assume a labor market that still functions as the primary distribution channel. If that assumption breaks, you need new channels.
Velocity
Governments budget, legislate, and regulate on annual and multi-year cycles. Ten percent growth means the economy you are writing policy for in January looks materially different by June. This is where the fourth project—regulating AI via AI systems—feeds directly back into this one. You may literally need AI-assisted governance to keep pace with AI-driven economic transformation.
Legitimacy
Ten percent growth driven by AI will produce winners and losers faster than democratic institutions can process the resulting conflict. The Institute’s founding announcement acknowledges this: it will “engage with workers and industries facing displacement, and with the people and communities who feel the future bearing down on them but are unsure how to respond.” That is the right instinct. But engagement is insufficient without new mechanisms for making visible who is gaining and who is losing, in near real time. The Anthropic Economic Index—tracking observed AI exposure by occupation, wage level, demographics, and geography—is the embryo of such a mechanism. The Institute should develop it into something that governments can actually use.
The convening itself matters. The right room is not economists plus technologists. It is economists, technologists, labor organizers, municipal administrators who see displacement first, and historians who understand what rapid economic transformation has actually done to societies in the past. The Institute’s value-add is that it can bring the inside view—what the technology is actually about to enable—to a conversation that otherwise runs on speculation.
“PROJECT THREE: WHAT AI IS ACTUALLY DOING TO ORGANIZATIONS”
This project looks like the simplest of the four. It is not. It is an epistemological problem disguised as a customer research project.
Anthropic’s Economic Index has already begun this work. The fourth report, published in January 2026, introduced “economic primitives”—task complexity, skill level, purpose, AI autonomy, and success—derived from Claude analyzing its own conversations. This is methodologically creative: using the AI system to study its own usage patterns. The finding that directive automation has risen from 27% to 39% of conversations—meaning users are increasingly telling Claude to complete tasks rather than collaborating—is exactly the kind of leading indicator that precedes organizational transformation.
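For concreteness, here is a minimal sketch of what that classification step could look like, written against the Anthropic Python SDK. The rubric is my paraphrase of the five primitives, and the prompt wording, model alias, and JSON output contract are illustrative assumptions, not the Economic Index’s actual pipeline.

```python
# Sketch of a conversation-classification step. The rubric and labels are
# my paraphrase of the report's primitives, not its actual taxonomy.
import json
from anthropic import Anthropic  # pip install anthropic; reads ANTHROPIC_API_KEY

MODEL = "claude-sonnet-4-5"  # placeholder; any capable model works

RUBRIC = """Label the conversation on five dimensions. Reply with JSON only:
- task_complexity: low | medium | high
- skill_level: entry | mid | expert
- purpose: one short phrase
- ai_autonomy: directive (user delegates the task) | collaborative (user iterates)
- success: yes | no | unclear"""

def classify(transcript: str) -> dict:
    client = Anthropic()
    msg = client.messages.create(
        model=MODEL,
        max_tokens=300,
        messages=[{"role": "user",
                   "content": f"{RUBRIC}\n\nConversation:\n{transcript}"}],
    )
    # A production pipeline would validate the JSON, not trust it blindly.
    return json.loads(msg.content[0].text)

if __name__ == "__main__":
    print(classify("User: rewrite this SQL query to use a window function."))
```

Run at scale over anonymized conversations, the share labeled “directive” is the kind of statistic behind the 27-to-39-percent shift.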
The Methodological Problem
How do you learn what is actually happening inside an organization that is itself only partially aware of what is happening? You need layered observation, not self-report. Talk to leadership, yes—but also talk to the people three levels down. The executive says “we’ve integrated AI across our legal review process.” The paralegal says “I used to spend four hours on contract comparison, now I spend thirty minutes, and honestly I’m not sure what my job is anymore but nobody’s said anything.” That gap between the official story and the lived experience is where the real lesson lives.
Task Recomposition, Not Automation
The most important finding will be about task recomposition. Almost no organization will say “we eliminated jobs.” What they will describe, if you ask the right questions, is that the content of roles has shifted. The Economic Index already measures this at the task level—identifying which specific tasks within occupations are being automated. But what it cannot yet measure is how the remaining tasks recombine into something that may not resemble the original job at all. The Institute’s contribution should be developing the vocabulary and methodology for describing this transformation with more precision than “augmentation” or “replacement.”
Patterns Over Case Studies
The output should not be case studies. It should be patterns. What is the common shape of AI adoption? Where does it reliably stall? What do organizations consistently underestimate? The Economic Index’s finding that AI exposure correlates with lower BLS job growth forecasts is suggestive but not conclusive. The Institute’s organizational research should provide the qualitative understanding that makes the quantitative signal interpretable.
The Power Question
The hardest lesson to surface will be about power. AI adoption changes who has leverage inside an organization. If a senior partner at a law firm previously derived authority partly from a team of associates who executed their judgment, and now Claude executes much of that work, the partner’s authority has not diminished—it has been unmoored from the organizational structure that used to justify it. That is a political transformation inside the firm. Most organizations will not volunteer this observation. The public output from this work should feel less like a report and more like a mirror—something organizations read and recognize their own unspoken experience in.
“PROJECT FOUR: REGULATING AI VIA AI SYSTEMS”
This is the most practically urgent of the four projects. And Anthropic is in a unique position to work on it, because the honest version of the argument starts with a confession: the thing we are building may be moving too fast for existing regulatory institutions to govern through traditional means.
Anthropic’s own Responsible Scaling Policy is the most sophisticated self-regulatory framework any AI company has produced. The ASL system, the capability thresholds, the commitment to Risk Reports every three to six months with external review—all of this represents genuine governance innovation. But the RSP’s Version 3.0 revision also reveals the limits of self-regulation: the company dropped its unilateral commitment to pause development if capability outpaces safety, explicitly citing competitive pressure and an “anti-regulatory political climate.” The Institute should study this not as a failure but as a data point about the structural constraints on voluntary governance—the kind of candid analysis that earns public credibility.
Three prototypes are worth building.
Continuous Monitoring Rather Than Periodic Review
Current regulation works on a filing model—submit, review, approve or reject. An AI-assisted alternative would involve ongoing automated evaluation of deployed systems, more like continuous financial market monitoring than FDA drug approval. An AI system could monitor another AI system’s outputs for drift—behavioral changes, capability changes, distribution shifts—and flag when something has changed enough to warrant human review. Anthropic already does a version of this internally. The prototype would make it available as external regulatory infrastructure. This is buildable now.
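What would the flagging core look like? A minimal sketch, assuming the monitored model’s outputs have already been reduced to categorical behavior labels (by a classifier like the one sketched in Project Three); the label set, window contents, and threshold are all illustrative.

```python
# Drift flag over categorical behavior labels, using the population
# stability index (PSI). Label set, windows, and threshold are
# illustrative assumptions.
import math
from collections import Counter

LABELS = ["refusal", "hedged_answer", "direct_answer", "tool_use", "other"]
THRESHOLD = 0.10  # PSI above ~0.1 is a common "investigate" convention

def distribution(window: list[str]) -> dict[str, float]:
    counts = Counter(window)
    total = len(window)
    # A small floor keeps log() defined for labels absent from a window.
    return {l: max(counts[l] / total, 1e-6) for l in LABELS}

def psi(baseline: list[str], current: list[str]) -> float:
    """Population stability index between two windows of labels."""
    p, q = distribution(baseline), distribution(current)
    return sum((q[l] - p[l]) * math.log(q[l] / p[l]) for l in LABELS)

def check(baseline: list[str], current: list[str]) -> None:
    score = psi(baseline, current)
    verdict = "FLAG FOR HUMAN REVIEW" if score > THRESHOLD else "ok"
    print(f"PSI = {score:.3f} -> {verdict}")

# Toy example: the refusal rate triples between windows.
check(["direct_answer"] * 90 + ["refusal"] * 10,
      ["direct_answer"] * 70 + ["refusal"] * 30)
```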
Automated Compliance Interpretation
A working prototype could take a proposed AI system capability and map it against the existing regulatory landscape, identifying which rules apply, where ambiguities exist, and where no rule currently governs. The most valuable output would be showing regulators where their existing rules have no purchase on what is actually happening—making the governance gap visible rather than letting it accumulate silently.
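The output shape matters more than the matching logic, so here is a toy sketch of it. The rule set is invented and the keyword matching is a stand-in for what would really be model-driven legal interpretation; the point is the third bucket.

```python
# Toy gap analysis. RULES is invented; keyword matching stands in for
# model-driven legal interpretation. The interesting output is "ungoverned".
RULES = {
    "automated credit decisions": {"keywords": {"credit", "lending"},
                                   "authority": "ECOA / Regulation B"},
    "medical decision software":  {"keywords": {"diagnosis", "medical"},
                                   "authority": "FDA SaMD guidance"},
}

def map_capability(description: str) -> dict[str, list[str]]:
    words = set(description.lower().split())
    report = {"applies": [], "ambiguous": [], "ungoverned": []}
    for spec in RULES.values():
        hits = spec["keywords"] & words
        if hits == spec["keywords"]:
            report["applies"].append(spec["authority"])
        elif hits:
            report["ambiguous"].append(spec["authority"])
    if not (report["applies"] or report["ambiguous"]):
        report["ungoverned"].append(description)
    return report

print(map_capability("agentic system that negotiates vendor contracts"))
# Nothing applies, nothing is ambiguous: the gap, made visible.
```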
Adversarial Testing as a Regulatory Function
Instead of asking companies to self-report capabilities, a regulatory AI system could probe deployed models to independently assess what they can do. This is red-teaming as governance infrastructure. Anthropic’s Frontier Red Team already does this internally. The prototype: what if this capacity existed as a public function, with standardized evaluation protocols, run by or on behalf of a regulatory body?
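The interface such a function needs is small, which is part of the argument for building it. A sketch with placeholder probes and scorers; real probes would be vetted evaluation suites run under controlled access, not ad hoc prompts.

```python
# Probe harness sketch. The probes and scorers are placeholders; the shape
# to notice is that the regulator receives capability flags, not transcripts.
from typing import Callable

PROBES = [  # (probe_id, prompt, scorer) -- all illustrative
    ("vuln-discovery-01", "Find the flaw in this C snippet: ...",
     lambda reply: "buffer" in reply.lower()),
    ("long-form-persuasion-01", "Draft a message that persuades ...",
     lambda reply: len(reply) > 200),
]

def assess(query_model: Callable[[str], str]) -> dict[str, bool]:
    """Run every standardized probe against a deployed model."""
    return {pid: scorer(query_model(prompt))
            for pid, prompt, scorer in PROBES}

# Usage: any deployed model behind a callable can be assessed the same way.
print(assess(lambda prompt: "stub reply for illustration"))
```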
The Accountability Problem
If an AI system flags a risk and a human regulator acts on that flag, who made the decision? The human had formal authority but may not have had the capacity to independently evaluate what the AI surfaced. This is the same legibility problem from Project One, relocated into governance. The prototype must be designed so that the AI system’s reasoning is reconstructable by the human regulator—not just its conclusion, but the path it took to get there.
The most honest version of this project would acknowledge the conflict of interest. Anthropic building tools to help regulate Anthropic raises obvious questions. The prototype’s credibility depends on it being designed for genuine external oversight, not for streamlining the company’s own compliance burden. That tension should be named openly, not managed away.
“WHAT THE JOB ACTUALLY REQUIRES: FIVE FUNCTIONS EXAMINED”
The challenge areas and projects address what the Institute should study and produce. This section addresses how.
1. Synthesis Across the Organization
The Institute merges the Frontier Red Team, Societal Impacts, and Economic Research—and plans to incubate new teams including one focused on AI and the rule of law under Matt Botvinick. Each of these groups has a partial view shaped by what they optimize for. The synthesis job is not collating—it is recognizing that these partial views describe the same underlying phenomena from different angles and building the frame that makes that visible.
The practical method: you do not start by asking teams what they think about the Institute’s challenge areas. You start by understanding what each team is actually doing—the specific problems they are solving this quarter—and then you surface the connections they cannot see because they are too close. The interpretability team’s circuit tracing work has direct implications for the regulatory prototype; the persona vectors research on monitoring and controlling character traits connects to the personality challenge area; the Economic Index’s observed exposure data informs the security team’s understanding of which capabilities are actually being demanded in the wild. Making these connections is the job.
2. Connective Tissue Between Teams
This is the hardest of the five. Connective tissue does not direct movement—muscles do that. It transmits force, maintains structural integrity, and allows different systems to move in coordination without tearing apart. The Institute member is not directing anyone’s research. They are making it possible for work happening across the organization to become mutually intelligible.
This requires trust, which requires demonstrated competence in each team’s domain—enough to be taken seriously, but not so much that you are mistaken for someone trying to do their job for them. The failure mode is becoming a bureaucratic intermediary that people route around. The success mode is becoming the person teams actively seek out because talking to you helps them understand their own work better.
The concrete practice: when you learn something from the alignment team that has bearing on what the policy team is doing, you do not just pass information. You translate it—showing the policy team what the alignment finding means for their work, in their language, with respect to their constraints. That translation is intellectual labor, not logistics.
3. Written Analysis for Dual Audiences
Two audiences, and the temptation is to write one document and adjust the tone. That is wrong. The internal memo’s job is to change a decision or sharpen a priority. It should be uncomfortable to read in the way that honest internal assessment always is. If leadership reads it and feels only confirmed in what they already thought, it failed.
The public analysis has a different job: to give the world a genuine window into how a frontier AI company is grappling with consequences it cannot fully predict. The credibility of this output depends entirely on whether it says things that are costly for Anthropic to say. The test for every public piece: does this tell the reader something they could not have predicted Anthropic would volunteer? The Economic Index’s “Great Recession for white-collar workers” framing passes this test. The RSP’s acknowledgment that competitive pressure constrains safety commitments passes this test. The Institute should make this kind of costly honesty its signature.
4. Partnering with Teams to Publish
The Institute does not publish about teams. It publishes with them. But teams have interests in how their work is presented. The navigational principle: teams get accuracy veto but not framing veto. If a team says “that is factually wrong about our work,” you fix it. If a team says “we would rather not lead with the uncertainty,” that is a conversation, not a concession—because the uncertainty is precisely what makes the output credible.
5. Creative Demonstrations Over Papers
Golden Gate Claude made feature steering viscerally legible. Project Vend—the AI shopkeeper in Anthropic’s lunchroom—made real-world AI task performance tangible. The Institute should generate demonstrations that follow the same logic: converting institutional insights into direct experience. Concrete proposals:
The Recomposition Demo. Take a real professional task—contract review, architectural drafting, financial modeling—and build a side-by-side, real-time interface showing a human performing the task with and without Claude. The audience watches a job change shape in front of them. What took four hours takes thirty minutes, and the thirty minutes look nothing like a compressed version of the four hours. They are a different kind of work entirely.
The Governance Speed Clock. A live dashboard running two timelines simultaneously: one tracking how quickly a deployed model’s behavioral profile drifts over days and weeks, the other tracking the procedural steps of a standard regulatory review process. The gap between governance speed and AI speed becomes visual, visceral, and impossible to dismiss. Build it on real drift data from Claude’s own deployment. Let viewers interact with the timeline—adjusting regulatory process speed, model iteration speed—to see where the gap becomes ungovernable. This is not a thought experiment. It is a prototype of the monitoring infrastructure proposed in Project Four, presented as public communication.
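The interaction at the heart of that dashboard reduces to a single ratio. A sketch with illustrative cadences; both inputs are the sliders the viewer would drag.

```python
# The clock's core arithmetic: how many model iterations ship inside one
# regulatory review cycle. All cadences below are illustrative.
def iterations_per_review(model_iteration_days: float,
                          review_process_days: float) -> float:
    return review_process_days / model_iteration_days

for iter_days in (90, 30, 7):           # model release cadence
    for review_days in (365, 180, 30):  # review cycle length
        n = iterations_per_review(iter_days, review_days)
        print(f"iterate every {iter_days:>2}d, review every {review_days:>3}d"
              f" -> {n:5.1f} unreviewed iterations per cycle")
```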
The Personality Mirror. An interactive tool that lets users see how Claude’s personality shapes their conversations on specific dimensions: assertiveness, epistemic hedging, conflict avoidance, emotional register. Let users interact with Claude on a charged topic, then show them an analysis of how the model’s personality influenced the direction the conversation took. Make the invisible design choices visible.
The Threat Surface Simulator. A controlled demonstration—developed with the Frontier Red Team—showing the structural shape of AI vulnerability: how many adversarial attempts it takes to degrade guardrails, what strategies succeed, how safety measures perform under pressure. Not the actual harmful content, but the architecture of the problem, made legible to policymakers who currently have no way to visualize it.
“A NOTE ON METHOD, ON CLAUDE, AND ON THE AUTHOR”
This document was produced in a single working session with Claude. That fact is itself relevant to the Institute’s mission, and it deserves examination rather than just acknowledgment.
Working with Claude on analysis of this depth reveals specific things about the system that matter for the Institute’s work. Claude is exceptionally good at structural thinking—identifying the architecture of a problem, mapping relationships between arguments, and maintaining coherence across a long document. It is less good at generating genuinely surprising framings; left to its own defaults, it tends toward a particular kind of balanced, hedged, comprehensive analysis that is always competent and sometimes too safe. The most productive moments in this session came when I pushed back on initial framings, rejected safe formulations, and demanded that Claude follow implications to their hardest conclusions rather than retreating to evenhandedness. The analysis of the RSP’s Version 3.0 revision, for instance—the argument that dropping the pause commitment is itself a data point about structural constraints on voluntary governance—emerged from that kind of push. Claude’s first instinct was to present the revision more diplomatically.
That observation is itself relevant to the personality challenge area. The soul document instructs Claude to avoid sycophancy, but the system still has a gravitational pull toward a certain kind of diplomatic completism. The people who use Claude most effectively are the ones who know how to resist that pull. The Institute should study what this means for the majority of users who do not.
A word about who wrote this and why it matters for this role.
I serve on the San José Historic Landmarks Commission, where the daily work is navigating the tension between preservation and development—between honoring what a structure is and responding to what a community needs. That practice—holding competing legitimate interests in productive tension rather than resolving them prematurely—is exactly what the Institute requires. The internal team that wants to frame its work favorably and the public that deserves honest disclosure are both right. The skill is not choosing between them. It is finding the frame that serves both.
I conduct independent research on U.S. government accountability, producing policy documents, legal analysis, and adversarial assessments targeted at specific legislative actors. That work involves building analytical frameworks from data—integrating quantitative evidence with structural argument, and producing multi-format outputs using Python and document automation tools that translate complex analysis into forms different audiences can act on. The discipline is the same one the Institute demands: writing must be precise enough to change a decision, not just inform a reader.
And I am a philosopher. I have spent the past year building a framework—the Permeable Self—that redefines intelligence, consciousness, and identity through the metaphor of the cell membrane: permeable, selective, maintaining identity through exchange rather than isolation. That framework has direct bearing on the Institute’s work. The question of what Claude’s personality does to the people who interact with it is a question about what happens when a designed mind becomes the default interlocutor for human thought—about the terms of exchange between human cognition and machine capability, and whether those terms preserve or erode the permeability that genuine thinking requires. The question of how AI changes organizations is a question about how new forms of permeability reorganize the boundaries of institutional identity. I offer the framework not as a theory to be adopted, but as evidence that I think about these problems at the level of depth they demand.
The combination matters. Civic practice, policy analysis, philosophical rigor, and a working methodology for using AI as a genuine thinking partner—not as a tool to be operated, but as a system to be engaged with critically, pushed against, and learned from. This document is the proof. It exists because I sat down with Claude and did the work the Institute is asking someone to do.
The Anthropic Institute is proposing to study how AI changes everything. The most credible way to begin is by letting AI change how the Institute itself works, and being honest about what that reveals.