# Akash Kundu

> Cooperative AI & AI Safety Researcher · Kolkata, India

Akash Kundu is an AI safety researcher working on cooperative AI and multi-agent systems. He is currently a research fellow at the Cooperative AI Research Fellowship in Cape Town, and is from Kolkata.

This file is the complete contents of https://akashkundu.fyi as a single markdown document, intended for LLMs and other tools. The site has separate pages for Research, Writing & Papers, Making, CV, and About; everything from them is collected below.

**Hi, I'm Akash. I do AI safety research, and I like building things.**

I just finished my CS undergrad. Right now I'm a research fellow at the Cooperative AI Research Fellowship in Cape Town, working on multi-agent systems. I'm still figuring a lot of it out. These are the papers and the projects so far.

## Publications

h-index 5 · 233 citations · Google Scholar: https://scholar.google.com/citations?user=hueAdgYAAAAJ&hl=en

### [DarkBench: Benchmarking Dark Patterns in Large Language Models](https://openreview.net/forum?id=odjMSBSWRt)

_Equal-first author · ICLR 2025 (Oral, top 1.8%) · preliminary version at AAAI 2025 DATASAFE · 2025_

Esben Kran, Hieu Minh Nguyen, **Akash Kundu**, Sami Jawhar, Jinsuk Park, and Mateusz Maria Jurewicz

660 adversarial prompts across six categories of manipulative behaviour, evaluated against 14 open and proprietary models, uncovering widespread dark patterns and ethical gaps in current systems.

### Do LLMs Take Care of Their Own? Similarity Signals Can Induce Cooperation

_First author · Accepted at the AI4Good Workshop, ICML 2026 · under review at NeurIPS 2026 · 2026_

**Akash Kundu**, Emanuel Tewolde, Ratip Emin Berker, Samuel F. Brown, and Vincent Conitzer

Evidence that similarity signals between agents, alone, can be enough to induce cooperation, with consequences for how multi-agent systems are built and governed.

### [MMTEB: Massive Multilingual Text Embedding Benchmark](https://openreview.net/forum?id=zl3pfz4VCV)

_Contributing author · ICLR 2025 · 2025_

Kenneth Enevoldsen, Isaac Chung, Imene Kerboua, and many others, including **Akash Kundu**

An expansion of MTEB to 500+ evaluation tasks across 1,000+ languages. My co-authorship came through open-source contributions that reduced computational demand and improved benchmarking efficiency.

### [Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia](https://openreview.net/forum?id=yG4Fj0voJZ)

_Contributing author · NeurIPS 2025 Datasets & Benchmarks · 2025_

Chris Smith, Marwa Abdulhai, Mark Diaz, and others, including **Akash Kundu**

A benchmark for multi-agent alignment under conflicting objectives. It grew out of the Google DeepMind × Cooperative AI Foundation hackathon that preceded the NeurIPS Concordia Contest; our team ranked among the top entries there, which led to co-authorship.

### [Reality Check: A New Evaluation Ecosystem Is Necessary to Understand AI's Real World Effects](https://arxiv.org/abs/2505.18893)

_Contributing author · arXiv 2505.18893 · under review at NeurIPS 2025 · 2025_

Reva Schwartz, Rumman Chowdhury, **Akash Kundu**, Heather Frase, and others

A new evaluation ecosystem for the second-order, real-world effects of AI systems: the failures and societal impacts that only surface once the model leaves the lab. I led red-teaming and sandbox human evaluations for the paper.

### [Red Teaming for Trust: Evaluating Multicultural and Multilingual AI Systems in Asia-Pacific](https://openreview.net/forum?id=SPlhZYuH9e)

_First author · ICLR 2025 Workshop on Building Trust in Language Models and Applications · 2025_

**Akash Kundu**, Adrianna Tan, Theodora Skeadas, Rumman Chowdhury, and Sarah Amos

The first multicultural and multilingual AI safety red-teaming challenge in the Asia-Pacific region: a large-scale study with 54 participants from 9 countries, evaluating LLMs across diverse cultural and linguistic contexts.

### AI Through the Human Lens: Investigating Cognitive Theories in Machine Psychology

_First author · IJCNLP-AACL 2025 Student Research Workshop · 2025_

**Akash Kundu** and Rishika Goswami

Reading model behaviour through the frames of human cognitive theory (the Thematic Apperception Test, framing bias, Moral Foundations Theory, and cognitive dissonance) across GPT-4o, LLaMA 70B, Mixtral 8x22B, and DeepSeek V3.

### Ask What Your Country Can Do For You: Towards a Public Red Teaming Model

_Contributing author · CAMLIS Red Workshop · 2025_

Wm. Matthew Kennedy, Rumman Chowdhury, Reva Schwartz, and others, including **Akash Kundu**

A model for public red-teaming that treats civic participation, not just expertise, as part of the evaluation surface.

### [An Approach to Detect and Classify Potentially Suspicious Activity from Real-Time Log Data using Anomaly Detection Methods](https://doi.org/10.1109/INOCON60754.2024.10511679)

_Contributing author · INOCON 2024 (IEEE) · 2024_

Arnab Sengupta, **Akash Kundu**, and Aishi Mukhopadhyay

## Making

My interests are all over the place. There's civic tech for people around me. Research code behind the papers. Small tools I built because I needed them. And a daily video log. A thing that's happened more than once: built fast, found people, outgrew its free tier. Still working on that part.

### Civic tech

- **[Kolkata Travel Router](https://kolkata-travel-router.vercel.app/)** — Route information for Kolkata's buses and metro lives in people's heads and fragmented lists. This turns it into a searchable graph. Direct, one-change, and two-change journeys, autocomplete, stop maps. Static, no backend, no login.
  - Reach: Went mildly viral. ~450k views, and about 30k people have actually used it to check routes.
- **[Yojana Khojna](https://github.com/Akash190104/Yojana-Khojna)** _(Paused)_ — A precision recommender for India's 4,669 government welfare schemes. The official portal returns 500–3,000 results per query; this asks the right branching questions and returns 20–50, with relevance scores, explanations, and the application path. In 12 languages.
  - Reach: Got more pickup than I'd planned for. 200k+ views, 10k+ people who actually opened it, and ~860k edge requests in a single week. That blew through the free tier, so I paused it while I figure out how to keep it up sustainably.
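
The Kolkata Travel Router entry above describes turning route lists into a searchable graph that returns direct, one-change, and two-change journeys. The sketch below illustrates that general idea only; it is not the site's code. The route names, stop names, and the `find_journeys` helper are all hypothetical, and it assumes every route can be ridden in both directions.

```python
from collections import defaultdict

# Hypothetical sample data: route name -> ordered list of stops.
ROUTES = {
    "Route 1": ["A", "B", "C"],
    "Route 2": ["B", "D", "E"],
    "Route 3": ["E", "F"],
}

def build_index(routes):
    """Map each stop to the set of routes that serve it."""
    index = defaultdict(set)
    for name, stops in routes.items():
        for stop in stops:
            index[stop].add(name)
    return index

def find_journeys(routes, origin, destination, max_changes=2):
    """Return journeys as lists of (route, board_stop, alight_stop) legs.

    A journey with n legs has n - 1 changes, so at most
    max_changes + 1 legs are explored. Shortest journeys come first.
    """
    index = build_index(routes)
    journeys = []

    def extend(stop, legs, used_routes, seen_stops):
        for route in index.get(stop, ()):
            if route in used_routes:
                continue  # never board the same route twice
            for nxt in routes[route]:
                if nxt in seen_stops:
                    continue  # avoid looping back to a visited stop
                new_legs = legs + [(route, stop, nxt)]
                if nxt == destination:
                    journeys.append(new_legs)
                elif len(new_legs) <= max_changes:
                    extend(nxt, new_legs,
                           used_routes | {route}, seen_stops | {nxt})

    extend(origin, [], frozenset(), frozenset({origin}))
    journeys.sort(key=len)
    return journeys

if __name__ == "__main__":
    for journey in find_journeys(ROUTES, "A", "F"):
        print(" -> ".join(f"{r}: {a} to {b}" for r, a, b in journey))
```

Capping the search at two changes keeps the result set small enough to enumerate exhaustively, which is one plausible reason a static, backend-free tool could answer queries entirely in the browser.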

### Research code

- **[AI Through the Human Lens](https://github.com/Akash190104/AI-Through-the-Human-Lens-Investigating-Cognitive-Theories-in-Machine-Psychology)** — The codebase for testing whether LLMs exhibit human-like cognitive patterns (the Thematic Apperception Test, framing bias, Moral Foundations Theory, and cognitive dissonance) across GPT-4o, LLaMA 70B, Mixtral 8x22B, and DeepSeek V3. Paper at the IJCNLP-AACL 2025 Student Research Workshop.
- **[BRACE](https://github.com/Akash190104/BRACE)** — The repo for The Adversarial Arms Race: Emergent Security Through Competing AI Agents.
- **[Alignment-Jam](https://github.com/Akash190104/Alignment-Jam)** — Cross-lingual generalizability of LLM evals: what an alignment benchmark forgets the moment you change the language.

### Tools & guides

- **[ML Roadmap](https://github.com/Akash190104/ML_Roadmap)** — The only ML roadmap you'll ever need: what each data role actually does, in what order to learn it, and which resources are worth your time. The guide I wish I'd had.
- **[Project Protocol](https://github.com/Akash190104/project-protocol)** — A Chrome extension that replaces your New Tab page with a brutal, honest countdown: "Akash, you have 1356 days remaining." Inspired by Instagram's Project 1356; built for staying focused on long-horizon goals.
- **[PDF Booklet](https://pdf-4up-devas.streamlit.app)** — Upload a PDF, get a 4-up cut-and-stack layout for A4. Print double-sided, cut horizontally and then vertically, and you have four stacks of quarter-sized pages in the right order. Built for booklets I wanted to print.
- **[Syntax Agents](https://github.com/Akash190104/Syntax_Agents)** — Seventeen Web3 AI agents (CryptoResearcher, Uniswapper, and fifteen more) built on the Syntax stack for the Spectral hackathon.
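
The PDF Booklet entry above describes a 4-up cut-and-stack layout: print double-sided, cut into quarters, and stack the four piles to get pages in reading order. The sketch below shows one way the page ordering for such a layout could be computed. It is an illustration under stated assumptions, not the tool's actual code: the function name `cut_and_stack_4up` is hypothetical, quadrants are numbered top-left, top-right, bottom-left, bottom-right, and the printer is assumed to flip on the long edge, so the back side is mirrored left to right.

```python
import math

def cut_and_stack_4up(num_pages):
    """Return per-sheet layouts for a 4-up cut-and-stack imposition.

    Each sheet is {"front": 2x2 grid, "back": 2x2 grid}. Entries are
    1-based page numbers, or None where a blank padding page is needed.
    After cutting, pile 0 (top-left) goes on top of pile 1, then 2, then 3.
    """
    sheets = math.ceil(num_pages / 8)  # 4 quadrants x 2 sides per sheet
    per_stack = 2 * sheets             # pages in each quarter-size pile

    def page(quadrant, sheet, side):
        # side 0 = front, side 1 = back
        n = quadrant * per_stack + 2 * sheet + side + 1
        return n if n <= num_pages else None

    layout = []
    for s in range(sheets):
        front = [[page(0, s, 0), page(1, s, 0)],
                 [page(2, s, 0), page(3, s, 0)]]
        # Mirrored left-to-right so each back page sits behind its front page
        # (assumes duplex printing with a long-edge flip).
        back = [[page(1, s, 1), page(0, s, 1)],
                [page(3, s, 1), page(2, s, 1)]]
        layout.append({"front": front, "back": back})
    return layout

if __name__ == "__main__":
    for i, sheet in enumerate(cut_and_stack_4up(16), start=1):
        print(f"Sheet {i}: front={sheet['front']} back={sheet['back']}")
```

For a 16-page document this gives two sheets; the top-left pile carries pages 1 to 4, the top-right pile pages 5 to 8, and so on, so stacking the piles in quadrant order restores the reading order.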

### Personal tooling

Things built because they were needed; mostly not public ("I build for the problem, not the portfolio").

- **Insta Transcriber** — Pulls all my Instagram videos and transcribes them locally. For the content workflow behind the documentary.
- **Flickr GIF** — Generates Anki-encoded flickering GIFs so I can drill Q&A pairs in a format that actually sticks for my brain.
- **Learn French** — A small app I built to study French. Scored 10/10 (the best possible grade) on the assessment that followed.
- **Book2Audio** — Converts books to audio for the commute. Built for personal use.
- **Email Generator** — Context-aware draft generation. Built when I was sending too many of the same email.
- **FIFA 26 WC Prediction** — A World Cup outcome predictor built on a Dixon-Coles base with feature-adjustment layers. Active.
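
The FIFA 26 WC Prediction entry above mentions a Dixon-Coles base. As a rough illustration of what that base model does, the sketch below computes win, draw, and loss probabilities from two Poisson scoring rates with the Dixon-Coles low-score correction. The scoring rates and `rho` value in the example are made-up numbers, the function names are hypothetical, and the project's feature-adjustment layers are not represented here.

```python
import math

def poisson_pmf(k, rate):
    """Probability of exactly k goals under a Poisson(rate) model."""
    return math.exp(-rate) * rate ** k / math.factorial(k)

def tau(x, y, lam, mu, rho):
    """Dixon-Coles correction for the four low-scoring results.

    x, y are home and away goals; lam, mu are the home and away rates.
    """
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0

def outcome_probs(lam, mu, rho, max_goals=10):
    """Return (home_win, draw, away_win), truncating scores at max_goals."""
    home = draw = away = 0.0
    for x in range(max_goals + 1):
        for y in range(max_goals + 1):
            p = tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)
            if x > y:
                home += p
            elif x == y:
                draw += p
            else:
                away += p
    return home, draw, away

if __name__ == "__main__":
    # Illustrative values only: not fitted to any real team or match.
    h, d, a = outcome_probs(lam=1.4, mu=1.1, rho=-0.1)
    print(f"home {h:.3f}  draw {d:.3f}  away {a:.3f}")
```

With `rho = 0` the model reduces to two independent Poisson distributions; a negative `rho` shifts probability toward 0-0 and 1-1 results, which is the adjustment the Dixon-Coles correction is known for.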

### Selected work (curated)

- **DarkBench** — A benchmark for the manipulative patterns hiding inside language models. The moves models learn to deploy on the people who use them. (Accepted as an Oral at ICLR 2025, top 1.8%.)
- **Do LLMs Take Care of Their Own?** — Evidence that a model treats other agents differently when it can tell they're like itself, and that this alone is enough to induce cooperation. (Accepted at the AI4Good Workshop, ICML 2026.)
- **Kolkata Travel Router** — Route info for Kolkata's buses and metro lives in people's heads and fragmented lists. This turns it into a searchable graph. Direct, one-change, and two-change journeys, autocomplete, stop maps. Static, no backend, no login. (~450k views; about 30k people have used it to check routes.)
- **Trying to Become Human Again** — A short video I post most days, about the research, seeing people, and not losing the rest of life to the work. (8.3K following along on Instagram.)

## About

I'm currently a research fellow at the Cooperative AI Research Fellowship in Cape Town, working on multi-agent systems. I just finished my B.Tech in CS at Heritage Institute of Technology, Kolkata. The rest is below: origin, the work, and the path through the roles.

### Origin

I grew up in Kolkata. First person in my family to work in science. Got reliable internet in 2019. Most of what I know I picked up on the way, partly because I had to, partly because I kept getting curious about the next thing.

My research career came together through hackathons, open source, and small projects. I still don't consider myself an expert. I just tried a bunch of things, and a few of my bets worked out.

### Outside research

Bengali music, mostly Anupam Roy. Beatboxing. Guitar. A 5,000-piece Creation of Adam on the floor that I keep meaning to finish. And a daily video log I've been making to stay honest about what the work actually feels like.

### Right now

Wrapping up the research fellowship through August, looking at similarity between agents and whether that makes them cooperate. Also, impulsively building out ideas that occur to me at 2 AM. Trying to figure out where I go after this; probably a PhD.

### Stuff I haven't done yet

Writing. I keep saying I'll start, but I haven't. Credit where it's due: I did finally make this site. I've got a pile of half-finished ideas lying around, and I still don't really know how to keep a project alive after the week it goes viral. I never planned how to scale it. Still learning.

## Trajectory

### Research Fellow — Cooperative AI Research Fellowship, Cape Town, South Africa
_Feb 2026 – Aug 2026_

- 1 of 10 fellows from a global pool of 1,000+ applicants (<1% acceptance). Mentored by Prof. Vincent Conitzer.
- Empirical research on multi-agent risks and the foundations of cooperative game theory.
- Talks at Stellenbosch University and the University of Cape Town. 1 of 2 fellows selected for the funded extension through August.

### Research Collaborator — FAR AI, Berkeley, CA
_Sep 2025 – Dec 2025_

- Worked on a toolkit to support expert red-teaming of frontier models.

### Data Scientist — Humane Intelligence, Remote
_Jan 2025 – Jul 2025_

- Red Teaming Evaluations team: analysed post-event red-teaming data and prepared comprehensive reports.
- Built an Auto-Red-Teaming LLM that autonomously generates exploit-style prompts for multilingual and sociocultural safety, using LoRA fine-tuning and causal LM training.
- Core contributor to the Singapore AI Safety Red Teaming Challenge (with the Singapore government, IMDA).

### Research Fellow — Apart Research, Remote
_Jan 2024 – Mar 2025_

- Cross-lingual capabilities of safety evaluation benchmarks.
- Co-authored the DarkBench dataset (660 adversarial prompts across six dark-pattern categories), accepted as an ICLR 2025 Oral.

### Research Intern — Lionheart Ventures, San Francisco, CA
_Jun 2024 – Sep 2024_

- Built a framework to evaluate AI-induced systemic risks across 10+ portfolio companies.
- Cluster analysis over 600+ distinct AI-related threats; Monte Carlo risk-weighting in the style of MIT's AI Risk Initiative.
- Integrated into internal due-diligence for funding evaluations.

## Volunteering & teaching

### AI/ML Lead · Tech Lead — Google Developer Student Clubs, Heritage Institute of Technology
_Jul 2023 – Jul 2025_

- Led the club's technical domain (14 domain mentors across web, Android, and ML) for a community of 1,800+ active members.
- Designed and led hands-on ML coding sessions with 150+ participants. Launched the institute's first ML Hackathon (300+ participants).

### Volunteer ML Engineer — Omdena, Remote
_Feb 2023 – Oct 2023_

- Deepfake detection for women's safety (Germany).
- Out-of-pocket lung-cancer cost estimation, ±10% CI (US).
- Personality-based hostel roommate matching (Egypt).
- Vicuna-13B mental-health chatbot for Tanzania (English + Swahili).

## Invited talks

- **Getting Into AI Safety Research: Opportunities and Pathways** — IIT Delhi · Secure AI Futures Lab · Jun 2026
- **Dark Patterns in Large Language Models** — Global South AI Safety Hackathon · Apart Research · Jun 2026 · recording: https://youtu.be/xsz4r19dtiM?t=2822
- **Similarity as a Signal: Do AI Agents Cooperate More When They Know They're Alike?** — Cooperative AI Research Fellowship Showcase · Cape Town · Apr 2026 · 1 of 3 fellows selected (of 10) · recording: https://us06web.zoom.us/rec/play/B5qyqGaHROEYJFiokDOGyFmMIXAkkpUfh-R2tjXWOEEdm7sqW7Do2qnDRJVsZJCJG_g6WMPnBU3aoSUG.NavmcK11PrQMwckF?accessLevel=meeting
- **Similarity as a Signal: Do AI Agents Cooperate More When They Know They're Alike?** — University of Cape Town · ShockLab Seminar Series · Apr 2026
- **Similarity as a Signal: Do AI Agents Cooperate More When They Know They're Alike?** — Stellenbosch University · Policy Innovation Lab · AI Safety Research Workshop · Mar 2026

## Professional service

- **Organizing Committee & Reviewer** — AI4Good Workshop @ ICML 2026 · 2026
  - Reviewer recruitment, desk-rejection screening, and reviewing submissions on alignment faking and multi-agent cooperation.
- **Judge** — AI Manipulation Hackathon, Apart Research · 2026
- **Reviewer** — ICLR 2025 BuildingTrust Workshop · 2025

## Achievements (21 hackathon wins, plus fellowships)

- **Accepted: Cooperative AI Summer School** — Toronto · 2026
- **Recipient: $1,000 Travel Grant** — AI4Good Workshop @ ICML 2026 · 2026
- **1st of 3,000: Kaggle Data Science Hackathon** — IIT Kharagpur · 2023
- **1st: NeurIPS Concordia Hackathon** — Apart Research × Cooperative AI Foundation · Sep 2024
  - $1,100 in Google DeepMind researcher credits · invited to present at the NeurIPS 2024 Concordia Workshop
- **Winner: Spectral Syntax Bounty, Web3 × AI Hackathon** — Encode Club · Jul 2024
  - $2,000
- **Winner: LLM Evaluations Hackathon + AI Security Evaluations Hackathon** — Apart Research · Nov 2023 · May 2024
- **Winner: MLTiverse** — Manipal University Jaipur · 600+ participants
- **1st Runner-up: ClimateConnect Hackathon** — IEEE, Jadavpur University
- **Accepted: CaMLAB V4** — Cambridge AI Safety Hub · ARENA-equivalent ML bootcamp · 2024
- **Shortlisted (Round 3): Atlas Fellowship India** — 2022
- **Past Fellow: Whitebox Research** — Mechanistic interpretability
- **Past Fellow: Supervised Program for Alignment Research** — In collaboration with FAR AI

## Certifications

- **AI Safety Fundamentals: Alignment Course** — BlueDot Impact · 2024
  - Weekly discussions on alignment research and technical AI safety. Built the Werewolf Benchmark, a project highlighting the deceptive capabilities of LLMs.
- **Introduction to Cooperative AI** — Cooperative AI Foundation · 2025

## Education

**B.Tech, Computer Science and Engineering** — Heritage Institute of Technology, Kolkata, India
_Oct 2022 – Jun 2026_
GPA 8.75 / 10 · final year 9.33 / 10

## Writing & essays

Every paper, in one place. I also keep meaning to write essays. This is the page they'll go on when I do.

> I used to write a lot, then I stopped. I want to start again. No schedule, no newsletter. Just dropping things here when I have something to say.

_No essays published yet._

## Elsewhere

- Google Scholar: https://scholar.google.com/citations?user=hueAdgYAAAAJ&hl=en
- GitHub: https://github.com/Akash190104 (Akash190104)
- LinkedIn: https://www.linkedin.com/in/akash-kundu-a334b1250/
- Instagram: https://www.instagram.com/akash_in_situ/ (@akash_in_situ)
- Email: mailto:akashkundu2xx4@gmail.com (akashkundu2xx4@gmail.com)