Do you know the paper The Algorithmic Self-Portrait: Deconstructing Memory in ChatGPT?


Yes, it's a recent paper (arXiv 2602.01450), published at the ACM Web Conference 2026 (WWW '26, Dubai, April 2026). The lead author is Abhisek Dash, with Savvas Zannettou among the co-authors. Here's a concise account of what it does and finds:

What it is: An empirical study of ChatGPT's memory feature, conceptualised as an Algorithmic Self-Portrait, a new form of personalisation derived from the information users disclose within private conversations.

Dataset: Data from 80 participants recruited via Prolific who exercised their GDPR right of access and voluntarily donated their ChatGPT interactions, comprising 1,058 conversations, 22,971 turns, and 2,050 memory entries.

Four research questions: Who holds greater agency in updating memory (user or system)? What information does the system store? How faithfully does memory represent conversations? Can memory inference be reverse-engineered to mitigate privacy risk?

Key findings:

  1. 96% of memories in the dataset are created unilaterally by the conversational system, potentially shifting agency away from the user.
  2. Memories contain a rich mix of GDPR-defined personal data (in 28% of memories) along with psychological insights about participants (in 52% of memories). The paper uses a Theory of Mind framework to characterise these psychological inferences as characteristic adaptations: deep dispositional traits, not just surface preferences.
  3. 84% of memories are directly grounded in user context, indicating generally faithful representation of conversations, though the system also performs systematic logical inference and extrapolation (e.g., "User asked for bands like Nirvana" becomes the stored memory "User likes Nirvana"; the first sketch after this list shows one way to test such grounding).
  4. Using open-source LLMs (Qwen2.5-32B, Gemma3-27B, GPT-OSS-20B) with in-context learning and fine-tuning, the researchers were able to imitate ChatGPT's memory extraction at roughly 60% semantic similarity with the ground-truth memories (the second sketch below illustrates this setup).
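
One way to operationalise the grounding check in finding 3 is textual entailment. This is a hedged sketch: the paper's exact grounding procedure isn't reproduced here, so treating grounding as an NLI problem is my assumption; the model name and label ordering follow the cross-encoder/nli-deberta-v3-base model card.

```python
# Hedged sketch: treat "is this memory grounded in the conversation?" as a
# textual-entailment question. Using NLI this way is an assumption, not
# necessarily the authors' method.
from sentence_transformers import CrossEncoder

# Off-the-shelf NLI cross-encoder; label order per its model card.
nli = CrossEncoder("cross-encoder/nli-deberta-v3-base")
LABELS = ["contradiction", "entailment", "neutral"]

def grounding_label(turn: str, memory: str) -> str:
    """Label a stored memory as entailed by, neutral to, or contradicting a turn."""
    scores = nli.predict([(turn, memory)])  # logits of shape (1, 3)
    return LABELS[int(scores[0].argmax())]

# The paper's Nirvana example is an extrapolation, so an NLI model would
# plausibly call it "neutral" rather than "entailment".
print(grounding_label("User asked for bands like Nirvana.", "User likes Nirvana."))
```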

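Concretely, the imitation setup in finding 4 might look like the following minimal sketch, not the paper's published configuration: the prompt format is invented, a small Qwen2.5-0.5B-Instruct checkpoint stands in for the 20B-32B models used in the study, and an all-MiniLM-L6-v2 embedder supplies the semantic-similarity score.

```python
# Minimal sketch of few-shot memory extraction plus similarity scoring.
# Prompt, stand-in model, and embedder are assumptions, not the paper's setup.
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

# Tiny stand-in for the paper's Qwen2.5-32B / Gemma3-27B / GPT-OSS-20B runs.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

FEW_SHOT = (
    "Extract a one-line memory the assistant might store about the user.\n"
    "Conversation: I'm training for a marathon in May.\n"
    "Memory: User is training for a marathon in May.\n"
    "Conversation: {turn}\n"
    "Memory:"
)

def extract_memory(turn: str) -> str:
    """Prompt the stand-in model and keep only the first generated line."""
    out = generator(FEW_SHOT.format(turn=turn), max_new_tokens=40,
                    do_sample=False, return_full_text=False)
    return out[0]["generated_text"].strip().splitlines()[0]

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_similarity(pred: str, gold: str) -> float:
    """Cosine similarity between the predicted and ground-truth memory."""
    emb = embedder.encode([pred, gold], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

# Illustrative turn and ground-truth memory (not from the donated dataset).
turn = "Can you recommend bands like Nirvana?"
gold = "User likes Nirvana."
pred = extract_memory(turn)
print(pred, round(semantic_similarity(pred, gold), 2))
```
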
The mitigation: The authors build Attribution Shield, which reverse-engineers the memory generation process to alert users to sensitive memories and to reformulate queries so that personal attribution is reduced while utility is preserved (a toy sketch of the detect-and-reformulate idea follows). Over 94% of reformulated queries successfully reduced attribution risk.
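
Attribution Shield's implementation isn't reproduced here, so the following is a deliberately naive caricature of the detect-and-reformulate idea: the regex patterns, categories, and rewrites are illustrative stand-ins for what is presumably an LLM-based pipeline.

```python
# Toy detect-and-reformulate sketch; not Attribution Shield's actual logic.
import re

# Naive detectors for self-disclosed attributes a memory system might store.
SENSITIVE_PATTERNS = {
    "health":   re.compile(r"\bmy (diagnosis|medication|condition)\b", re.I),
    "location": re.compile(r"\bI live in [A-Z][a-z]+"),
    "identity": re.compile(r"\bmy name is [A-Z][a-z]+", re.I),
}

def flag_sensitive(query: str) -> list[str]:
    """Return the categories of personal attribution detected in a query."""
    return [cat for cat, pat in SENSITIVE_PATTERNS.items() if pat.search(query)]

def reformulate(query: str) -> str:
    """Neutralise first-person disclosures while keeping the informational need."""
    q = re.sub(r"\bI live in [A-Z][a-z]+ and\s*", "", query)
    q = re.sub(r"\bmy (diagnosis|medication|condition)\b", r"a \1", q, flags=re.I)
    q = re.sub(r"^\s*I need\b", "Someone needs", q)
    return q.strip()

query = "I live in Lisbon and I need a therapist who treats my condition, ADHD."
print(flag_sensitive(query))  # ['health', 'location']
print(reformulate(query))     # Someone needs a therapist who treats a condition, ADHD.
```

Note the trade-off even in this toy version: dropping the location reduces attribution but also hurts the query's usefulness for finding a nearby therapist, which is the utility-versus-attribution tension the paper evaluates.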

The broader concern: These portraits are constructed with greater algorithmic autonomy than earlier forms of personalisation and capture users' deep psychological frameworks, raising serious security and privacy concerns. The opacity of the process (users are largely unaware that memories are being created, let alone what they contain) is the central governance issue the paper surfaces.

It's essentially the first systematic empirical audit of conversational AI memory through a privacy and agency lens, using GDPR data donations as the access mechanism. Would you like to discuss it in relation to the MCI framework you've just loaded? It connects quite directly to several of V1's legitimacy and non-domination concerns.

